Dwq: Esprit Long Term Research Project, No 22469. Deciding Equivalences among Aggregate Queries
Authors
Abstract
Equivalence of aggregate queries is investigated for the class of conjunctive queries with comparisons and the aggregate operators min, max, count, count-distinct, and sum. Essentially, this class contains all unnested SQL queries with the above aggregate operators, with a WHERE clause consisting of a conjunction of comparisons, and without a HAVING clause. The comparisons can be interpreted over either a dense order (e.g., over the rationals) or a discrete order (e.g., over the integers). Generally, however, different techniques and characterizations are needed in each of these two cases. For queries with either max or min, equivalence is characterized in terms of dominance mappings, which can be viewed as a generalization of containment mappings. For queries with the count-distinct operator, a sufficient condition for equivalence is given in terms of equivalence of conjunctive queries under set semantics. For some special cases, it is shown that this condition is also necessary. For conjunctive queries with comparisons but without aggregation, equivalence under bag-set semantics is characterized in terms of isomorphism. This characterization essentially remains the same also for queries with the count operator. Moreover, this characterization also applies to queries with the sum operator if the queries have either constants or comparisons, but not both. In the general case (i.e., both comparisons and constants), the characterization of the equivalence of queries with the sum operator is more elaborate. All the characterizations given in the paper are decidable with polynomial space. Finally, it is shown that all the characterizations for min-, max-, count-, and sum-queries yield polynomial-time algorithms for linear queries, i.e., queries with no repeated predicates in their bodies.
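As a small illustration of why set semantics and bag-set semantics call for different characterizations, the following Python sketch evaluates two conjunctive queries over a toy relation. The relation `P` and the queries `Q1(x) :- p(x, y)` and `Q2(x) :- p(x, y), p(x, z)` are assumptions chosen for illustration and are not taken from the paper. The two queries return the same set of answers, but they are not isomorphic and their count results differ.

```python
from collections import Counter
from itertools import product

# Hypothetical toy relation p(x, y), stored as a set of tuples (no duplicates),
# as assumed under bag-set semantics.
P = {("a", 1), ("a", 2), ("b", 1)}

def q1_bag(p):
    """Q1(x) :- p(x, y): one answer per satisfying assignment of (x, y)."""
    return Counter(x for (x, _y) in p)

def q2_bag(p):
    """Q2(x) :- p(x, y), p(x, z): one answer per satisfying assignment of (x, y, z)."""
    return Counter(x1 for (x1, _y), (x2, _z) in product(p, p) if x1 == x2)

# Set semantics: keep only the distinct answers.
set_q1 = set(q1_bag(P))
set_q2 = set(q2_bag(P))

# Bag-set semantics: keep the multiplicities, which is what a count query reports.
count_q1 = dict(q1_bag(P))
count_q2 = dict(q2_bag(P))

print(set_q1 == set_q2)      # the answer sets coincide
print(count_q1, count_q2)    # the multiplicities differ for x = "a"
```

Here the multiplicity of `"a"` is 2 under `Q1` and 4 under `Q2`, so adding a count operator separates two queries that set semantics cannot distinguish.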
Similar resources
Dwq: Esprit Long Term Research Project, No 22469. Representing and Reasoning on SGML Documents
In this paper, we address the issue of representing and reasoning about documents for which an explicit structure is provided. Specifically, we devise a framework where Document Type Definitions (DTDs) expressed in the Standard Generalized Markup Language (SGML) are formalized in an expressive Description Logic equipped with sound, complete, and terminating inference procedures. In this way, we p...
Equivalence, Containment and Rewriting of Aggregate Queries
The primary goal of this thesis is to lay the theoretical foundations for a formal study of aggregate query optimization. This requires gaining a coherent understanding of equivalences and containments between aggregate queries of varied forms. A secondary goal of this thesis is to solve the view usability problem for varied types of aggregate queries. The view usability problem is that of dete...
Axiomatic Foundations and Algorithms for Deciding Semantic Equivalences of SQL Queries
Deciding the equivalence of SQL queries is a fundamental problem in data management. As prior work has mainly focused on studying the theoretical limitations of the problem, very few implementations for checking such equivalences exist. In this paper, we present a new formalism and implementation for reasoning about the equivalences of SQL queries. Our formalism, U-semiring, extends SQL’s semir...
Dwq: Esprit Long Term Research Project, No 22469. Foundations of Data Warehouse Quality. Data Warehouse Configuration
In the data warehousing approach to the integration of data from multiple information sources, selected information is extracted in advance and stored in a repository. A data warehouse (DW) can therefore be seen as a set of materialized views defined over the sources. When a query is posed, it is evaluated locally, using the materialized views, without accessing the original information sources....
A Dynamic Method for Answering Ad-Hoc Continuous Aggregate Queries
Data streams are infinite, fast sequences of time-stamped data elements that arrive in bursts. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries over them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and accuracy of answers. They will be more ...
Journal title:
Volume, issue:
Pages: -
Publication date: 1998